17 research outputs found

    Centralized scientific communities are less likely to generate replicable results

    Concerns have been expressed about the robustness of experimental findings in several areas of science, but these matters have not been evaluated at scale. Here we identify a large sample of published drug-gene interaction claims curated in the Comparative Toxicogenomics Database (for example, benzo(a)pyrene decreases expression of SLC22A3) and evaluate these claims by connecting them with high-throughput experiments from the LINCS L1000 program. Our sample included 60,159 supporting findings and 4253 opposing findings about 51,292 drug-gene interaction claims in 3363 scientific articles. We show that claims reported in a single paper replicate 19.0% (95% confidence interval [CI], 16.9–21.2%) more frequently than expected, while claims reported in multiple papers replicate 45.5% (95% CI, 21.8–74.2%) more frequently than expected. We also analyze the subsample of interactions with two or more published findings (2493 claims; 6272 supporting findings; 339 opposing findings; 1282 research articles), and show that centralized scientific communities, which use similar methods and involve shared authors who contribute to many articles, propagate less replicable claims than decentralized communities, which use more diverse methods and contain more independent teams. Our findings suggest that policies that foster decentralized collaboration will increase the robustness of scientific findings in biomedical research.

    Evaluation of Data Sharing After Implementation of the International Committee of Medical Journal Editors Data Sharing Statement Requirement.

    Importance: The benefits of responsible sharing of individual-participant data (IPD) from clinical studies are well recognized, but stakeholders often disagree on how to align those benefits with privacy risks, costs, and incentives for clinical trialists and sponsors. The International Committee of Medical Journal Editors (ICMJE) required a data sharing statement (DSS) from submissions reporting clinical trials effective July 1, 2018. The required DSSs provide a window into current data sharing rates, practices, and norms among trialists and sponsors. Objective: To evaluate the implementation of the ICMJE DSS requirement in 3 leading medical journals: JAMA, Lancet, and New England Journal of Medicine (NEJM). Design, setting, and participants: This is a cross-sectional study of clinical trial reports published as articles in JAMA, Lancet, and NEJM between July 1, 2018, and April 4, 2020. Articles not eligible for DSS, including observational studies and letters or correspondence, were excluded. A MEDLINE/PubMed search identified 487 eligible clinical trials in JAMA (112 trials), Lancet (147 trials), and NEJM (228 trials). Two reviewers evaluated each of the 487 articles independently. Exposure: Publication of clinical trial reports in an ICMJE medical journal requiring a DSS. Main outcomes and measures: The primary outcomes of the study were declared data availability and actual data availability in repositories. Other captured outcomes were data type, access, and conditions and reasons for data availability or unavailability. Associations with funding sources were examined. Results: A total of 334 of 487 articles (68.6%; 95% CI, 64%-73%) declared data sharing, with nonindustry NIH-funded trials exhibiting the highest rates of declared data sharing (89%; 95% CI, 80%-98%) and industry-funded trials the lowest (61%; 95% CI, 54%-68%). However, only 2 IPD sets (0.6%; 95% CI, 0.0%-1.5%) were actually deidentified and publicly available as of April 10, 2020. The remaining were supposedly accessible via request to authors (143 of 334 articles [42.8%]), repository (89 of 334 articles [26.6%]), and company (78 of 334 articles [23.4%]). Among the 89 articles declaring that IPD would be stored in repositories, only 17 (19.1%) deposited data, mostly because of embargo and regulatory approval. Embargo was set in 47.3% of data-sharing articles (158 of 334), and in half of them the period exceeded 1 year or was unspecified. Conclusions and relevance: Most trials published in JAMA, Lancet, and NEJM after the implementation of the ICMJE policy declared their intent to make clinical data available. However, a wide gap between declared and actual data sharing exists. To improve transparency and data reuse, journals should promote the use of unique pointers to data set location and standardized choices for embargo periods and access requirements.
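
The headline proportion in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not say which interval method the authors used, so the normal-approximation (Wald) interval and the helper name `wald_ci` are assumptions made here, not the study's method.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) for a normal-approximation (Wald) interval.

    Assumption: the abstract does not name its interval method; Wald is used
    here only because it is the simplest textbook choice.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts quoted in the abstract: 334 of 487 articles declared data sharing.
p, lower, upper = wald_ci(334, 487)
print(f"declared data sharing: {p:.1%} (95% CI {lower:.1%} to {upper:.1%})")

# Share of repository-declaring articles that actually deposited data (17 of 89).
print(f"deposited among repository declarations: {17 / 89:.1%}")
```

Rounded to whole percentages, the interval from this sketch agrees with the 64%-73% range reported for the 334 of 487 figure. Intervals for very small counts, such as the 2 publicly available IPD sets, are better handled by an exact or Wilson method, so the Wald sketch should not be expected to reproduce them.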

    The worldwide clinical trial research response to the COVID-19 pandemic - the first 100 days

    Background: Never before have clinical trials drawn as much public attention as those testing interventions for COVID-19. We aimed to describe the worldwide COVID-19 clinical research response and its evolution over the first 100 days of the pandemic. Methods: Descriptive analysis of planned, ongoing or completed trials by April 9, 2020 testing any intervention to treat or prevent COVID-19, systematically identified in trial registries, preprint servers, and literature databases. A survey was conducted of all trials to assess their recruitment status up to July 6, 2020. Results: Most of the 689 trials (overall target sample size 396,366) were small (median sample size 120; interquartile range [IQR] 60-300) but randomized (75.8%; n=522) and were often conducted in China (51.1%; n=352) or the USA (11%; n=76). 525 trials (76.2%) planned to include 155,571 hospitalized patients, and 25 (3.6%) planned to include 96,821 health-care workers. Treatments were evaluated in 607 trials (88.1%), frequently antivirals (n=144) or antimalarials (n=112); 78 trials (11.3%) focused on prevention, including 14 vaccine trials. No trial investigated social distancing. Interventions tested in 11 trials with >5,000 participants were also tested in 169 smaller trials (median sample size 273; IQR 90-700). Hydroxychloroquine alone was investigated in 110 trials. While 414 trials (60.0%) expected completion in 2020, only 35 trials (4.1%; 3,071 participants) were completed by July 6. Of 112 trials with detailed recruitment information, 55 had recruited <20% of the targeted sample; 27 between 20-50%; and 30 over 50% (median 14.8% [IQR 2.0-62.0%]). Conclusions: The size and speed of the COVID-19 clinical trials agenda are unprecedented. However, most trials were small and investigated only a small fraction of treatment options. The feasibility of this research agenda is questionable, and many trials may end in futility, wasting research resources. Much better coordination is needed to respond to global health threats.

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

    Spatial network structures of world migration: heterogeneity of global and local connectivity

    The landscape of world migration involves multiple interacting movements of people at various geographic scales, posing significant challenges to the dyadic-independence assumption underlying standard migration models. To account for emerging patterns of multilateral migration relationships, we represent world migration as a time-evolving, spatial network. The nodes in the World Migration Network (WMN) are countries located in geographic space, and the edges represent migratory movements for each decade from 1960-2000. In the first part of the thesis, we characterise the spatial network structure of the WMN, with a particular focus on detecting and mapping mesoscopic structures called 'communities' (i.e., sets of countries with denser migration connections internally than to the rest of the WMN). We employ a method for community detection that simultaneously accounts for multilateral migration, spatial constraints, time-dependence, and directionality in the WMN. We then introduce an approach for characterising local (intracommunity) and global (intercommunity) connectivity in the WMN. On this basis, we define a threefold typology that distinguishes 'cave', 'bi-regional', and 'bridging' communities. These are characterised by distinct migration patterns, spatial network structures, and temporal dynamics: cave communities are tightly-knit enduring structures that channel local migration between contiguous countries; bi-regional communities merge migration between two distinct geographic regions; bridging communities have hub-and-spoke dynamic structures that emerge from globe-spanning movements. Our results suggest that the WMN neither forms a globally interconnected network nor simply reproduces geographic boundaries, but involves heterogeneous patterns of global and local ('glocal') migration connectivity. We examine a set of relational, homophily, and spatial mechanisms that could have generated the 'glocal' structure we observe. We found that communities of different types arise from significantly different mechanisms. Our results suggest that migration communities can have important implications for world migration, as different types of community structure provide distinct opportunities and constraints, thereby distinctively shaping future migration patterns.
